[Python] SSO using Flask, requests-oauthlib and pyjwt

I am currently developing an application that will need Single Sign-On to delegate user management and authorization to a third party Identity Provider (IdP). Although the domain of the application is not entirely finished and I have not started developing the API, the Proof of Concept of using the backend to generate and retrieve the JWT from an IdP is done, and maybe it will be of use to guide others in this process.

This article is basically tying up two examples and an article, namely requests-oauthlib’s Web App Example, pyjwt’s Retrieve RSA signing keys from a JWKS endpoint, and Auth0’s How to Verify a JWT. In addition, we use Selenium’s outstanding API to open a browser session and wait for the authentication flow to complete before retrieving the session cookies.

Background

Before diving into the code in all its prototypical glory, I think it is important to at least refer to the OIDC authentication flow. If you feel comfortable with the flow already you can skip to Pre-requisites.

I think the best source of high level information about OIDC comes from Auth0 articles. What we are implementing here is a form of Authorization Code Flow, one of the many flow types supported by OIDC.

In this flow we are relying on redirecting the user from our backend’s /login endpoint to the Identity Provider authentication page, in order to generate code and state after a successful authentication. This ensures better compatibility at the expense of having a browser as a dependency for the backend. We will show how we can still leverage a browser in a CLI for authentication using Selenium.

The code and state are then sent to the backend’s /callback endpoint via redirect. The backend in turn exchanges the code for a JWT carrying the user claims and sends it back to the user using Flask’s native session API.

The last step is to extract the token from the session cookies. A JWT is designed to be self-contained, whereas relying on Flask’s session adds a point of failure: if we decide to rotate the application’s session secret, existing session cookies can no longer be read.

Design Decisions

For this example we will rely heavily on OIDC’s configuration endpoint, which is based on the well-known URIs defined in RFC 5785. Auth0, the IdP I used to validate the code in this article, supports it through its OIDC Discovery endpoint.

All the necessary endpoints, like authorization_endpoint, token_endpoint and jwks_uri, are present in Auth0’s configuration endpoint response.

I can’t say that this will be supported by every IdP, but the examples in this article can be expanded to include explicitly defined endpoints for those three resources we need in order to authorize the user.

Pre-requisites

For the examples in this article you will need:

  • An Auth0 tenant

  • An Auth0 Application configured to accept connections from http://localhost:5000

  • A user in your Auth0 Tenant

  • A python virtual environment with the following requirements.txt installed:
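A minimal, unpinned requirements.txt covering the libraries used in this article could look like the following. The exact versions are an open choice and the list itself is an assumption based on the libraries named in the text; the crypto extra pulls in the cryptography package, which pyjwt needs to verify RSA-signed tokens.

```text
flask
requests
requests-oauthlib
pyjwt[crypto]
selenium
```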

All code was tested using python 3.9.9.

Authentication Flow

The first thing is to set up a Flask app. We need to set the app’s config SECRET_KEY so that we can later use Flask’s session.
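A minimal sketch of that setup follows. The environment variable name FLASK_SECRET_KEY and the random fallback are assumptions made for illustration, not something the article prescribes.

```python
import os
import secrets

from flask import Flask

app = Flask(__name__)

# Flask signs the session cookie with SECRET_KEY. Reading it from the
# environment (FLASK_SECRET_KEY is an assumed name) keeps it out of the code.
# The random fallback is only suitable for local experiments: every restart
# generates a new key, so previously issued session cookies become unreadable.
app.config["SECRET_KEY"] = os.environ.get("FLASK_SECRET_KEY") or secrets.token_hex(32)
```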

Then we declare our IdP settings:
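As a sketch, the settings could be declared like this. Every value is a placeholder, and the environment variable names are assumptions introduced here for illustration.

```python
import os

# Placeholder values: replace them with your own tenant and application data.
# The environment variable names are hypothetical.
OPENID_CONFIG_URL = os.environ.get(
    "OIDC_WELL_KNOWN_URL",
    "https://YOUR_DOMAIN/.well-known/openid-configuration",
)
CLIENT_ID = os.environ.get("OIDC_CLIENT_ID", "YOUR_CLIENT_ID")
CLIENT_SECRET = os.environ.get("OIDC_CLIENT_SECRET", "YOUR_CLIENT_SECRET")

# "openid" is what turns a plain OAuth2 request into an OIDC one; "profile"
# and "email" ask for the identity claims we want in the token.
SCOPES = ["openid", "profile", "email"]
```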

Make sure to replace the necessary fields with your tenant openid configuration endpoint, client ID and secret.

You could also try to use a different Identity Provider. Just make sure you replace the scope list with the corresponding scopes. What we need is to be able to produce a JWT with identity claims containing email and username.

Now we can implement the endpoints for the auth flow, namely /login and /callback, but first let’s implement two functions that will help us fetch the well-known metadata and create an OAuth2 session:
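A minimal sketch of the two helpers, assuming placeholder settings and a callback served at http://localhost:5000/callback (REDIRECT_URI is a name introduced here):

```python
import requests
from requests_oauthlib import OAuth2Session

# Placeholder settings (assumptions): replace with your own values.
OPENID_CONFIG_URL = "https://YOUR_DOMAIN/.well-known/openid-configuration"
CLIENT_ID = "YOUR_CLIENT_ID"
SCOPES = ["openid", "profile", "email"]
REDIRECT_URI = "http://localhost:5000/callback"


def get_well_known_metadata():
    """Fetch and parse the IdP's OpenID configuration document."""
    response = requests.get(OPENID_CONFIG_URL, timeout=10)
    response.raise_for_status()
    return response.json()


def get_oauth2_session(**kwargs):
    """Build an OAuth2Session; extra kwargs such as state are passed through."""
    return OAuth2Session(
        CLIENT_ID, scope=SCOPES, redirect_uri=REDIRECT_URI, **kwargs
    )
```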

The first function, get_well_known_metadata, will get and parse the JSON response of the well-known endpoint URL. The second function, get_oauth2_session, will create an OAuth2Session instance with our client ID, scopes and a redirect_uri for our callback endpoint that we will implement next. The **kwargs will be used in the callback endpoint to pass the authorization state to the OAuth2Session to prevent CSRF attacks.

We can then implement our login and callback endpoints.

First, the login:
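A minimal, self-contained sketch of such a login endpoint follows. The settings are placeholders and the two helpers are repeated so the block runs on its own; the session key oauth_state follows the naming used in the requests-oauthlib example.

```python
import requests
from flask import Flask, redirect, session
from requests_oauthlib import OAuth2Session

app = Flask(__name__)
app.config["SECRET_KEY"] = "dev-only-secret"  # placeholder, use a real secret

# Placeholder settings (assumptions): replace with your own values.
OPENID_CONFIG_URL = "https://YOUR_DOMAIN/.well-known/openid-configuration"
CLIENT_ID = "YOUR_CLIENT_ID"
SCOPES = ["openid", "profile", "email"]
REDIRECT_URI = "http://localhost:5000/callback"


def get_well_known_metadata():
    response = requests.get(OPENID_CONFIG_URL, timeout=10)
    response.raise_for_status()
    return response.json()


def get_oauth2_session(**kwargs):
    return OAuth2Session(
        CLIENT_ID, scope=SCOPES, redirect_uri=REDIRECT_URI, **kwargs
    )


@app.route("/login")
def login():
    metadata = get_well_known_metadata()
    oauth = get_oauth2_session()
    authorization_url, state = oauth.authorization_url(
        metadata["authorization_endpoint"]
    )
    # Remember the state so the callback can detect forged responses.
    session["oauth_state"] = state
    return redirect(authorization_url)
```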

This piece of code is pretty straightforward and very similar to the original demo endpoint implemented in the requests-oauthlib Web App example.

We are producing an authorization URL to which we are going to redirect our user. The authorization_endpoint is extracted from the JSON response of the IdP’s well-known configuration endpoint. The state is saved in the session to be used later.

Then, we have the callback endpoint:
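A minimal, self-contained sketch of such a callback endpoint follows, again with placeholder settings and repeated helpers. The explicit comparison of the state query parameter is an addition of this sketch, so that the CSRF check does not depend on how the library is called; the session key oauth_token matches the name used later in this article.

```python
import requests
from flask import Flask, abort, request, session
from requests_oauthlib import OAuth2Session

app = Flask(__name__)
app.config["SECRET_KEY"] = "dev-only-secret"  # placeholder, use a real secret

# Placeholder settings (assumptions): replace with your own values.
OPENID_CONFIG_URL = "https://YOUR_DOMAIN/.well-known/openid-configuration"
CLIENT_ID = "YOUR_CLIENT_ID"
CLIENT_SECRET = "YOUR_CLIENT_SECRET"
SCOPES = ["openid", "profile", "email"]
REDIRECT_URI = "http://localhost:5000/callback"


def get_well_known_metadata():
    response = requests.get(OPENID_CONFIG_URL, timeout=10)
    response.raise_for_status()
    return response.json()


def get_oauth2_session(**kwargs):
    return OAuth2Session(
        CLIENT_ID, scope=SCOPES, redirect_uri=REDIRECT_URI, **kwargs
    )


@app.route("/callback")
def callback():
    # Reject responses whose state does not match the one saved at /login.
    expected_state = session.get("oauth_state")
    if expected_state is None or request.args.get("state") != expected_state:
        abort(400)
    metadata = get_well_known_metadata()
    oauth = get_oauth2_session(state=expected_state)
    token = oauth.fetch_token(
        metadata["token_endpoint"],
        client_secret=CLIENT_SECRET,
        code=request.args["code"],
    )
    # Keep only the ID token (the JWT with the user claims) in the session.
    session["oauth_token"] = token["id_token"]
    return "ok"
```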

This endpoint creates another OAuth2Session instance, but passing the previous oauth state. We then use the token_endpoint, client_secret and the code from the query string to get the token metadata and save the id_token in the session.

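I am not entirely sure every Identity Provider returns exactly the same token_endpoint response fields. The OAuth2 RFC 6749 defines the token endpoint in Section 3.2 and the fields of a successful response, such as access_token and token_type, in Section 5.1. OpenID Connect Core adds id_token to that response, so an OIDC-compliant IdP should return it.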

By now we have the basic authentication flow to obtain a valid token and populate the client’s session cookies with it. We now want to provide an endpoint that extracts the JWT from the session cookies and returns it to the user in plain text:
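A minimal sketch of such an endpoint, assuming the callback stored the token under the session key oauth_token. Answering 401 when no token is present is a choice made in this sketch.

```python
from flask import Flask, session

app = Flask(__name__)
app.config["SECRET_KEY"] = "dev-only-secret"  # placeholder, use a real secret


@app.route("/user/token")
def user_token():
    # The callback endpoint is expected to have stored the JWT here.
    token = session.get("oauth_token")
    if token is None:
        return "authorization token is missing", 401
    return token
```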

This endpoint is very simple and just returns the oauth_token field included in the session, the one which we populated in the callback endpoint.

Validating the JWT token

Now that we have a valid JWT token both in the session cookies and in plaintext, we can create an interceptor to validate the token before the request is processed by whichever endpoint controller the request is being sent to.

We provide two ways to send the token to the backend, in the session cookies or using the Authorization header. The following code extracts the token and validates it using pyjwt:
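A minimal sketch of such an interceptor follows. JWKS_URI is a placeholder for the jwks_uri value from the well-known metadata. The ALLOWED_ALGORITHMS allowlist and the audience check against the client ID are assumptions of this sketch: OIDC ID tokens carry the client ID in their aud claim, and pyjwt rejects a token with an aud claim when no audience is given.

```python
import jwt
from flask import Flask, request, session

app = Flask(__name__)
app.config["SECRET_KEY"] = "dev-only-secret"  # placeholder, use a real secret

# Placeholders (assumptions): in the article, jwks_uri comes from the
# well-known metadata and CLIENT_ID from the IdP settings.
JWKS_URI = "https://YOUR_DOMAIN/.well-known/jwks.json"
CLIENT_ID = "YOUR_CLIENT_ID"
ALLOWED_ALGORITHMS = {"RS256"}
PUBLIC_ENDPOINTS = {"login", "callback"}


def get_jwks_client():
    return jwt.PyJWKClient(JWKS_URI)


@app.before_request
def verify_token():
    # The endpoints that produce the token cannot require one.
    if request.endpoint in PUBLIC_ENDPOINTS:
        return None
    if "Authorization" in request.headers:
        parts = request.headers["Authorization"].split()
        token = parts[1] if len(parts) == 2 else None
    else:
        token = session.get("oauth_token")
    if not token:
        return {"error": "authorization token is missing"}, 401
    try:
        # The header is read before verification, so only accept expected algorithms.
        algorithm = jwt.get_unverified_header(token).get("alg")
        if algorithm not in ALLOWED_ALGORITHMS:
            raise jwt.InvalidAlgorithmError(f"unexpected algorithm: {algorithm}")
        signing_key = get_jwks_client().get_signing_key_from_jwt(token)
        request.user_data = jwt.decode(
            token, signing_key.key, algorithms=[algorithm], audience=CLIENT_ID
        )
    except jwt.PyJWTError as error:
        return {"error": str(error)}, 401
    return None
```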

The function get_jwks_client produces a PyJWKClient instance using the jwks_uri we retrieved from the IdP’s well-known configuration endpoint.

We have to make sure both the login and callback endpoints are exempted from the token verification interceptor, because these endpoints are necessary to produce the token in the first place.

Then we can test whether the token is in the Authorization header or in the session. If we can’t find it, we just return an Unauthorized response telling the user that the authorization token is missing.

To retrieve the token from the Authorization header we have to do a little string manipulation, because the header value has the form <scheme> <credentials>, which for a JWT is usually Authorization: Bearer <jwt>. Therefore we split the header value on whitespace and take the second element.

Then we can use the PyJWKClient example from pyjwt to extract the signing key. The example uses hardcoded algorithms, but we can avoid hardcoding them by using jwt’s function get_unverified_header. The function will return the JWT token header before verification, which will include the alg field, telling which algorithm was used to sign the token. Since that header has not been verified yet, it is safer to accept the alg value only when it is one of the algorithms you expect. The use of this function is illustrated in Auth0’s article How to Handle JWT in Python, namely in the How to Verify a JWT section.

After decoding the token we can just populate a custom field in the request, called user_data which will contain all JWT claims to be used throughout the request lifespan. To illustrate the use, let’s create an endpoint to return the signed-in user’s email:
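A minimal sketch of that endpoint. It assumes the verification interceptor has already set request.user_data for the current request and that the token carries an email claim, which depends on the scopes requested.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/user/id")
def user_id():
    # user_data holds the decoded JWT claims; it is set by the token
    # verification interceptor before this function runs.
    return request.user_data["email"]
```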

Calling this endpoint with either the Authorization header set or the original session cookies will return your Auth0 user’s email. Note that /user/id actually answers with the user’s email. This is because my application will eventually use the user’s email for unique identification. You can later change it to /user/email or return the actual user id, if you so desire.

Using the login flow in a CLI

Part of the scope of my original project is to provide means for the user to interact with the API using a CLI. However, to stay as consistent as possible with the Identity Provider’s own authentication flow, I decided to implement a communication flow that opens a browser session from the CLI and retrieves the JWT using the driver’s session cookies.

For this exercise we will use Selenium and chromium.

We start by creating a chromium driver that will direct the user to the backend’s login page:

This will open a browser session directly to the login page, without tabs or a URL bar, and with data persistence.

Before filling the login page with your credentials, we can first leverage Selenium’s WebDriverWait and automate both session cookies retrieval and driver closure.

The backend callback endpoint returns ok if the authentication flow is successful, so we can use that to wait for the response:

This is not robust at all, but it serves to illustrate the concept. We could, for instance, return a div with a UUID as its ID and look for that element using Selenium’s ID locator.

The code will block and wait until the ok is sent. You can now proceed to log in with your client as usual.

When ok is sent, the wait lock is released and we can fetch the session cookies and close the driver:
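A small sketch of that step; the function name is hypothetical. The try/finally makes sure the browser is closed even if reading the cookies fails.

```python
def collect_cookies_and_quit(driver):
    """Return the driver's cookies, then close the browser."""
    try:
        return driver.get_cookies()
    finally:
        driver.quit()
```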

With the session cookies at hand, we can proceed to use python’s requests to fetch the actual token. First we need to map the session cookies to something that requests understands. Selenium returns the chromium session cookies as a list of dictionaries, one per cookie, with keys such as name, value, domain and path.

Python requests expects cookies to be in the form of a key-value dictionary. We can convert the chromium cookies using dict comprehension:
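A minimal sketch of the conversion. The sample cookie is made up for illustration: it only mirrors the keys Selenium reports and uses a placeholder value.

```python
def to_requests_cookies(selenium_cookies):
    """Map Selenium's list of cookie dicts to a name -> value dictionary."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Made-up sample in the shape returned by driver.get_cookies().
sample_cookies = [
    {
        "domain": "localhost",
        "httpOnly": True,
        "name": "session",
        "path": "/",
        "secure": False,
        "value": "PLACEHOLDER_SESSION_VALUE",
    }
]

cookies = to_requests_cookies(sample_cookies)
```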

Then we can retrieve the plaintext token using the /user/token endpoint:
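A minimal sketch of that request, assuming the backend runs on http://localhost:5000 and that cookies is the key-value dictionary built from the browser session. The function name is hypothetical.

```python
import requests

BASE_URL = "http://localhost:5000"  # assumed backend address


def fetch_plaintext_token(cookies):
    """Ask the backend for the JWT stored in the Flask session."""
    response = requests.get(f"{BASE_URL}/user/token", cookies=cookies, timeout=10)
    response.raise_for_status()
    return response.text
```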

Now we have a JWT token in our CLI that does not depend on the backend application secret.

Let’s test the token by sending a request to the /user/id:
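A minimal sketch of that request. This time only the token is sent, in the Authorization header using the Bearer scheme, with no session cookies; the backend address and the function name are assumptions.

```python
import requests

BASE_URL = "http://localhost:5000"  # assumed backend address


def fetch_user_id(token):
    """Call /user/id authenticating with the JWT alone."""
    response = requests.get(
        f"{BASE_URL}/user/id",
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.text
```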

This should produce your user’s email.

Conclusion

In this article we laid out a simple building block for a backend with OIDC SSO and a CLI frontend that successfully communicates with it.

The examples in this article are fairly simple and a direct result of the well-documented APIs they use, but hopefully they can serve as a starting point for others to build and expand upon.

The complete, uninterrupted code for both the backend and the CLI prototype can be found here: https://gist.github.com/gchamon/0c8632bfd32aea9a6a5a558f823e7a24

If you have any questions or suggestions, please feel free to contact me anytime!
